Questionnaire on Well-Being (QWB)

CFA

Author

Magnus Johansson

Published

2024-09-19

1 Reproducing CFA

The Questionnaire on Well-Being (QWB) consists of 18 items, each scored on a scale of 0 to 4 (Hlynsson et al. 2024). The data come from the same paper; we’ll use the second study’s dataset, which was used for the CFA reported there.

Code and data were retrieved from the paper’s OSF page. It is really great to see these materials made available; sharing them is such an important step towards improving the standards of science!

Code
# Read in study two data -------------------------------------------------------
dd <- read_excel("data/study_two.xlsx")

onefactor <- 'f1 =~ swb1 + swb2 + swb3 + swb4 + swb5 + swb6 + swb7 + swb8 +
                    swb9 + swb10 + swb11 + swb12 + swb13 + swb14 + swb15 + 
                    swb16 + swb17 + swb18'

# Fit the model to the data
cfamodel <- sem(model = onefactor, data = dd, estimator = "WLSMV") 
Warning: lavaan->lav_options_est_dwls():  
   estimator "DWLS" is not recommended for continuous data. Did you forget to 
   set the ordered= argument?

This warning message is important! Without the ordered argument, lavaan treats the items as continuous, so for WLSMV to work as intended one also needs to specify ordered = TRUE.
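As a sketch of the fix (using the `dd` data and `onefactor` model string from above), the `ordered` argument can be given either as `TRUE` or as a vector of item names; the latter is safer when a data frame also contains continuous covariates:

```r
library(lavaan)

# Assumes dd and onefactor as defined above.
# Declare all modeled variables as ordinal:
cfa_ordered <- sem(onefactor, data = dd, estimator = "WLSMV", ordered = TRUE)

# Or name the ordinal items explicitly, avoiding the risk of also
# declaring continuous covariates as ordered:
cfa_ordered2 <- sem(onefactor, data = dd, estimator = "WLSMV",
                    ordered = paste0("swb", 1:18))
```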

Let’s see if we can reproduce the fit metrics reported in the paper (p.15), using the output from the misspecified function call above.

Code
cfamodel %>% summary(standardized = TRUE, ci = FALSE, fit.measures = TRUE)
lavaan 0.6-18 ended normally after 33 iterations

  Estimator                                       DWLS
  Optimization method                           NLMINB
  Number of model parameters                        36

                                                  Used       Total
  Number of observations                          1561        1795

Model Test User Model:
                                              Standard      Scaled
  Test Statistic                               603.028    1576.509
  Degrees of freedom                               135         135
  P-value (Chi-square)                           0.000       0.000
  Scaling correction factor                                  0.391
  Shift parameter                                           33.384
    simple second-order correction                                

Model Test Baseline Model:

  Test statistic                             40316.305    8701.238
  Degrees of freedom                               153         153
  P-value                                        0.000       0.000
  Scaling correction factor                                  4.698

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    0.988       0.831
  Tucker-Lewis Index (TLI)                       0.987       0.809
                                                                  
  Robust Comparative Fit Index (CFI)                         0.986
  Robust Tucker-Lewis Index (TLI)                            0.984

Root Mean Square Error of Approximation:

  RMSEA                                          0.047       0.083
  90 Percent confidence interval - lower         0.043       0.079
  90 Percent confidence interval - upper         0.051       0.086
  P-value H_0: RMSEA <= 0.050                    0.887       0.000
  P-value H_0: RMSEA >= 0.080                    0.000       0.892
                                                                  
  Robust RMSEA                                               0.052
  90 Percent confidence interval - lower                     0.049
  90 Percent confidence interval - upper                     0.054
  P-value H_0: Robust RMSEA <= 0.050                         0.106
  P-value H_0: Robust RMSEA >= 0.080                         0.000

Standardized Root Mean Square Residual:

  SRMR                                           0.053       0.053

Parameter Estimates:

  Standard errors                           Robust.sem
  Information                                 Expected
  Information saturated (h1) model        Unstructured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  f1 =~                                                                 
    swb1              1.000                               0.639    0.695
    swb2              0.861    0.035   24.575    0.000    0.550    0.544
    swb3              0.550    0.036   15.121    0.000    0.352    0.413
    swb4              0.876    0.034   25.809    0.000    0.560    0.647
    swb5              0.929    0.039   23.849    0.000    0.594    0.635
    swb6              0.967    0.040   24.054    0.000    0.618    0.656
    swb7              1.064    0.036   29.625    0.000    0.680    0.765
    swb8              1.142    0.034   33.522    0.000    0.730    0.808
    swb9              1.124    0.036   30.957    0.000    0.718    0.780
    swb10             1.182    0.040   29.564    0.000    0.755    0.751
    swb11             1.202    0.045   26.933    0.000    0.768    0.730
    swb12             0.872    0.037   23.385    0.000    0.557    0.654
    swb13             0.707    0.039   17.969    0.000    0.452    0.493
    swb14             0.980    0.035   27.907    0.000    0.626    0.666
    swb15             1.202    0.043   28.182    0.000    0.768    0.743
    swb16             1.104    0.036   30.255    0.000    0.706    0.715
    swb17             1.097    0.040   27.296    0.000    0.701    0.728
    swb18             0.981    0.045   21.770    0.000    0.627    0.587

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .swb1              0.436    0.018   24.187    0.000    0.436    0.517
   .swb2              0.721    0.025   28.874    0.000    0.721    0.704
   .swb3              0.601    0.024   24.670    0.000    0.601    0.829
   .swb4              0.435    0.017   25.780    0.000    0.435    0.581
   .swb5              0.523    0.019   27.276    0.000    0.523    0.597
   .swb6              0.506    0.019   26.097    0.000    0.506    0.570
   .swb7              0.327    0.014   22.718    0.000    0.327    0.414
   .swb8              0.283    0.012   23.155    0.000    0.283    0.347
   .swb9              0.332    0.013   25.432    0.000    0.332    0.392
   .swb10             0.441    0.018   24.399    0.000    0.441    0.436
   .swb11             0.518    0.022   23.263    0.000    0.518    0.468
   .swb12             0.415    0.017   24.855    0.000    0.415    0.572
   .swb13             0.634    0.022   28.570    0.000    0.634    0.757
   .swb14             0.493    0.019   26.580    0.000    0.493    0.557
   .swb15             0.480    0.020   24.323    0.000    0.480    0.449
   .swb16             0.476    0.019   25.534    0.000    0.476    0.489
   .swb17             0.436    0.019   22.826    0.000    0.436    0.470
   .swb18             0.749    0.028   26.616    0.000    0.749    0.656
    f1                0.408    0.026   15.931    0.000    1.000    1.000

The “Standard” column in the output matches what is reported in the paper (see quote below) for χ2, RMSEA, and CFI. Good to see that the result is reproducible.

A single-factor solution for the Confirmatory factor analysis for the Questionnaire on Well-Being (QWB) resulted in a good fit for the data: χ2(135) = 603.03, p < 0.001, CFI = 0.988, SRMR = 0.053, RMSEA = 0.047 [90% CI: 0.043, 0.051]. Thus, our single-factor model for the QWB exhibits all of our predetermined criteria for a good model fit.

Let’s re-run the CFA with ordered = TRUE added, so that the WLSMV estimator, which was correctly described in the paper, works as intended.

Code
cfamodel2 <- sem(model = onefactor, data = dd, estimator = "WLSMV", ordered = TRUE)
cfamodel2 %>% summary(standardized = TRUE, ci = FALSE, fit.measures = TRUE)
lavaan 0.6-18 ended normally after 34 iterations

  Estimator                                       DWLS
  Optimization method                           NLMINB
  Number of model parameters                        90

                                                  Used       Total
  Number of observations                          1561        1795

Model Test User Model:
                                              Standard      Scaled
  Test Statistic                              1832.971    2940.978
  Degrees of freedom                               135         135
  P-value (Chi-square)                           0.000       0.000
  Scaling correction factor                                  0.630
  Shift parameter                                           32.413
    simple second-order correction                                

Model Test Baseline Model:

  Test statistic                            131413.959   40457.340
  Degrees of freedom                               153         153
  P-value                                        0.000       0.000
  Scaling correction factor                                  3.257

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    0.987       0.930
  Tucker-Lewis Index (TLI)                       0.985       0.921
                                                                  
  Robust Comparative Fit Index (CFI)                         0.831
  Robust Tucker-Lewis Index (TLI)                            0.808

Root Mean Square Error of Approximation:

  RMSEA                                          0.090       0.115
  90 Percent confidence interval - lower         0.086       0.112
  90 Percent confidence interval - upper         0.093       0.119
  P-value H_0: RMSEA <= 0.050                    0.000       0.000
  P-value H_0: RMSEA >= 0.080                    1.000       1.000
                                                                  
  Robust RMSEA                                               0.126
  90 Percent confidence interval - lower                     0.122
  90 Percent confidence interval - upper                     0.130
  P-value H_0: Robust RMSEA <= 0.050                         0.000
  P-value H_0: Robust RMSEA >= 0.080                         1.000

Standardized Root Mean Square Residual:

  SRMR                                           0.060       0.060

Parameter Estimates:

  Parameterization                               Delta
  Standard errors                           Robust.sem
  Information                                 Expected
  Information saturated (h1) model        Unstructured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  f1 =~                                                                 
    swb1              1.000                               0.740    0.740
    swb2              0.788    0.022   35.977    0.000    0.583    0.583
    swb3              0.614    0.029   21.080    0.000    0.454    0.454
    swb4              0.946    0.021   44.763    0.000    0.700    0.700
    swb5              0.944    0.022   43.672    0.000    0.699    0.699
    swb6              0.940    0.023   41.605    0.000    0.695    0.695
    swb7              1.123    0.020   56.041    0.000    0.831    0.831
    swb8              1.187    0.019   62.155    0.000    0.878    0.878
    swb9              1.112    0.019   59.591    0.000    0.823    0.823
    swb10             1.061    0.020   54.326    0.000    0.785    0.785
    swb11             1.076    0.021   50.727    0.000    0.796    0.796
    swb12             0.954    0.022   42.924    0.000    0.706    0.706
    swb13             0.713    0.026   27.920    0.000    0.527    0.527
    swb14             0.946    0.020   46.765    0.000    0.700    0.700
    swb15             1.064    0.020   52.990    0.000    0.787    0.787
    swb16             1.009    0.019   54.088    0.000    0.747    0.747
    swb17             1.066    0.020   52.085    0.000    0.789    0.789
    swb18             0.840    0.024   35.203    0.000    0.622    0.622

Thresholds:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
    swb1|t1          -2.252    0.088  -25.646    0.000   -2.252   -2.252
    swb1|t2          -1.076    0.039  -27.312    0.000   -1.076   -1.076
    swb1|t3          -0.102    0.032   -3.213    0.001   -0.102   -0.102
    swb1|t4           1.117    0.040   27.866    0.000    1.117    1.117
    swb2|t1          -1.949    0.067  -29.073    0.000   -1.949   -1.949
    swb2|t2          -0.847    0.036  -23.368    0.000   -0.847   -0.847
    swb2|t3          -0.020    0.032   -0.633    0.527   -0.020   -0.020
    swb2|t4           1.082    0.039   27.392    0.000    1.082    1.082
    swb3|t1          -2.613    0.129  -20.271    0.000   -2.613   -2.613
    swb3|t2          -1.627    0.053  -30.772    0.000   -1.627   -1.627
    swb3|t3          -0.886    0.037  -24.149    0.000   -0.886   -0.886
    swb3|t4           0.316    0.032    9.776    0.000    0.316    0.316
    swb4|t1          -2.341    0.096  -24.399    0.000   -2.341   -2.341
    swb4|t2          -1.189    0.041  -28.723    0.000   -1.189   -1.189
    swb4|t3          -0.075    0.032   -2.353    0.019   -0.075   -0.075
    swb4|t4           1.212    0.042   28.968    0.000    1.212    1.212
    swb5|t1          -2.044    0.073  -28.162    0.000   -2.044   -2.044
    swb5|t2          -0.908    0.037  -24.557    0.000   -0.908   -0.908
    swb5|t3           0.191    0.032    5.993    0.000    0.191    0.191
    swb5|t4           1.257    0.043   29.397    0.000    1.257    1.257
    swb6|t1          -2.232    0.086  -25.911    0.000   -2.232   -2.232
    swb6|t2          -1.151    0.041  -28.285    0.000   -1.151   -1.151
    swb6|t3          -0.133    0.032   -4.174    0.000   -0.133   -0.133
    swb6|t4           0.948    0.037   25.272    0.000    0.948    0.948
    swb7|t1          -2.317    0.094  -24.744    0.000   -2.317   -2.317
    swb7|t2          -1.312    0.044  -29.847    0.000   -1.312   -1.312
    swb7|t3          -0.208    0.032   -6.498    0.000   -0.208   -0.208
    swb7|t4           0.978    0.038   25.798    0.000    0.978    0.978
    swb8|t1          -2.294    0.092  -25.065    0.000   -2.294   -2.294
    swb8|t2          -1.105    0.040  -27.710    0.000   -1.105   -1.105
    swb8|t3          -0.038    0.032   -1.189    0.234   -0.038   -0.038
    swb8|t4           1.145    0.041   28.210    0.000    1.145    1.145
    swb9|t1          -1.852    0.062  -29.836    0.000   -1.852   -1.852
    swb9|t2          -0.602    0.034  -17.755    0.000   -0.602   -0.602
    swb9|t3           0.534    0.033   15.978    0.000    0.534    0.534
    swb9|t4           1.542    0.050   30.794    0.000    1.542    1.542
    swb10|t1         -1.664    0.054  -30.703    0.000   -1.664   -1.664
    swb10|t2         -0.645    0.034  -18.832    0.000   -0.645   -0.645
    swb10|t3          0.311    0.032    9.625    0.000    0.311    0.311
    swb10|t4          1.375    0.045   30.257    0.000    1.375    1.375
    swb11|t1         -2.018    0.071  -28.422    0.000   -2.018   -2.018
    swb11|t2         -1.099    0.040  -27.631    0.000   -1.099   -1.099
    swb11|t3         -0.279    0.032   -8.668    0.000   -0.279   -0.279
    swb11|t4          0.598    0.034   17.657    0.000    0.598    0.598
    swb12|t1         -2.423    0.104  -23.194    0.000   -2.423   -2.423
    swb12|t2         -1.367    0.045  -30.210    0.000   -1.367   -1.367
    swb12|t3         -0.092    0.032   -2.909    0.004   -0.092   -0.092
    swb12|t4          1.082    0.039   27.392    0.000    1.082    1.082
    swb13|t1         -2.070    0.074  -27.876    0.000   -2.070   -2.070
    swb13|t2         -1.236    0.042  -29.203    0.000   -1.236   -1.236
    swb13|t3         -0.183    0.032   -5.741    0.000   -0.183   -0.183
    swb13|t4          1.029    0.039   26.611    0.000    1.029    1.029
    swb14|t1         -1.994    0.070  -28.659    0.000   -1.994   -1.994
    swb14|t2         -0.915    0.037  -24.692    0.000   -0.915   -0.915
    swb14|t3          0.105    0.032    3.314    0.001    0.105    0.105
    swb14|t4          1.275    0.043   29.553    0.000    1.275    1.275
    swb15|t1         -1.718    0.056  -30.542    0.000   -1.718   -1.718
    swb15|t2         -0.842    0.036  -23.275    0.000   -0.842   -0.842
    swb15|t3          0.112    0.032    3.516    0.000    0.112    0.112
    swb15|t4          1.099    0.040   27.631    0.000    1.099    1.099
    swb16|t1         -1.834    0.061  -29.952    0.000   -1.834   -1.834
    swb16|t2         -0.901    0.037  -24.422    0.000   -0.901   -0.901
    swb16|t3          0.075    0.032    2.353    0.019    0.075    0.075
    swb16|t4          1.179    0.041   28.616    0.000    1.179    1.179
    swb17|t1         -2.098    0.076  -27.561    0.000   -2.098   -2.098
    swb17|t2         -1.260    0.043  -29.429    0.000   -1.260   -1.260
    swb17|t3         -0.323    0.032   -9.977    0.000   -0.323   -0.323
    swb17|t4          0.756    0.035   21.440    0.000    0.756    0.756
    swb18|t1         -1.658    0.054  -30.718    0.000   -1.658   -1.658
    swb18|t2         -0.828    0.036  -22.996    0.000   -0.828   -0.828
    swb18|t3          0.010    0.032    0.329    0.742    0.010    0.010
    swb18|t4          1.034    0.039   26.695    0.000    1.034    1.034

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .swb1              0.453                               0.453    0.453
   .swb2              0.660                               0.660    0.660
   .swb3              0.794                               0.794    0.794
   .swb4              0.510                               0.510    0.510
   .swb5              0.512                               0.512    0.512
   .swb6              0.517                               0.517    0.517
   .swb7              0.310                               0.310    0.310
   .swb8              0.229                               0.229    0.229
   .swb9              0.323                               0.323    0.323
   .swb10             0.384                               0.384    0.384
   .swb11             0.366                               0.366    0.366
   .swb12             0.502                               0.502    0.502
   .swb13             0.722                               0.722    0.722
   .swb14             0.511                               0.511    0.511
   .swb15             0.380                               0.380    0.380
   .swb16             0.442                               0.442    0.442
   .swb17             0.378                               0.378    0.378
   .swb18             0.614                               0.614    0.614
    f1                0.547    0.018   29.873    0.000    1.000    1.000

Before looking more closely at the results and comparing them to the published metrics, we need to address the issue of reporting the correct, scaled model fit metrics.

The often used Hu & Bentler (1999) cutoff values (also used in the paper) are based on simulations of continuous data and ML estimation. As such, they are not appropriate for ordinal data analyzed with the WLSMV estimator (McNeish 2023; Savalei 2018). The R-package dynamic can produce appropriate cutoff values for model fit indices. We’ll get into that after reviewing the scaled fit metrics and modification indices.

1.1 Scaled fit metrics

For WLSMV, the .scaled metrics should be reported.

Code
fit_metrics_scaled <- c("chisq.scaled", "df", "pvalue.scaled", 
                        "cfi.scaled", "tli.scaled", "rmsea.scaled", 
                        "rmsea.ci.lower.scaled","rmsea.ci.upper.scaled",
                        "srmr")

fitmeasures(cfamodel2, fit_metrics_scaled) %>% 
  rbind() %>% 
  as.data.frame() %>% 
  mutate(across(where(is.numeric),~ round(.x, 3))) %>%
  rename(Chi2.scaled = chisq.scaled,
         p.scaled = pvalue.scaled,
         CFI.scaled = cfi.scaled,
         TLI.scaled = tli.scaled,
         RMSEA.scaled = rmsea.scaled,
         CI_low.scaled = rmsea.ci.lower.scaled,
         CI_high.scaled = rmsea.ci.upper.scaled,
         SRMR = srmr) %>% 
  knitr::kable()
Chi2.scaled df p.scaled CFI.scaled TLI.scaled RMSEA.scaled CI_low.scaled CI_high.scaled SRMR
2940.978 135 0 0.93 0.921 0.115 0.112 0.119 0.06

Again, these were the metrics reported in the paper:

A single-factor solution for the Confirmatory factor analysis for the Questionnaire on Well-Being (QWB) resulted in a good fit for the data: χ2(135) = 603.03, p < 0.001, CFI = 0.988, SRMR = 0.053, RMSEA = 0.047 [90% CI: 0.043, 0.051]. Thus, our single-factor model for the QWB exhibits all of our predetermined criteria for a good model fit.

The differences between the model fit metrics in the table above and those found in the paper are partly due to the missing ordered = TRUE option, but they also stem from reporting the wrong metrics for the WLSMV estimator.

The correct model fit metrics indicate problems no matter which cutoffs one uses, especially regarding RMSEA. Let us review the modification indices.

1.2 Modification indices

We’ll filter the list and present only modifications with mi (χ2) values above 30.

Code
modificationIndices(cfamodel2,
                    standardized = T) %>% 
  as.data.frame(row.names = NULL) %>% 
  filter(mi > 30) %>% 
  arrange(desc(mi)) %>% 
  mutate(across(where(is.numeric),~ round(.x, 3))) %>%
  knitr::kable()
lhs op rhs mi epc sepc.lv sepc.all sepc.nox
swb4 ~~ swb5 213.550 -0.242 -0.242 -0.474 -0.474
swb5 ~~ swb12 155.477 -0.211 -0.211 -0.416 -0.416
swb7 ~~ swb8 132.329 -0.143 -0.143 -0.536 -0.536
swb11 ~~ swb17 102.493 -0.144 -0.144 -0.388 -0.388
swb1 ~~ swb2 78.974 -0.172 -0.172 -0.315 -0.315
swb1 ~~ swb16 68.421 -0.138 -0.138 -0.309 -0.309
swb2 ~~ swb16 68.162 -0.161 -0.161 -0.298 -0.298
swb5 ~~ swb6 63.609 -0.146 -0.146 -0.284 -0.284
swb15 ~~ swb17 62.165 -0.117 -0.117 -0.308 -0.308
swb12 ~~ swb13 58.963 -0.162 -0.162 -0.269 -0.269
swb13 ~~ swb14 48.027 -0.148 -0.148 -0.244 -0.244
swb9 ~~ swb18 42.649 -0.120 -0.120 -0.269 -0.269
swb4 ~~ swb12 39.141 -0.118 -0.118 -0.233 -0.233
swb2 ~~ swb17 34.139 0.137 0.137 0.275 0.275
swb8 ~~ swb12 33.173 0.123 0.123 0.363 0.363
swb2 ~~ swb3 31.034 -0.132 -0.132 -0.182 -0.182

There are many very large mi (χ2) values, driven by residual correlations between item pairs.
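Modification indices flag one parameter at a time; a complementary check (a sketch, assuming the fitted `cfamodel2` object from above) is to inspect the matrix of residual correlations directly:

```r
library(lavaan)

# Residual (observed minus model-implied) correlations; large absolute
# values indicate item pairs the single factor cannot account for.
resid_cor <- lavResiduals(cfamodel2, type = "cor")$cov

# List pairs with residual correlations above 0.10 in absolute value
# (the 0.10 threshold is an arbitrary choice for illustration):
idx <- which(abs(resid_cor) > 0.10 & lower.tri(resid_cor), arr.ind = TRUE)
data.frame(item1 = rownames(resid_cor)[idx[, 1]],
           item2 = colnames(resid_cor)[idx[, 2]],
           resid = round(resid_cor[idx], 2))
```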

1.3 Dynamic cutoff values

In order to establish useful cutoff values for the WLSMV estimator with ordinal data, we need to run simulations relevant to the current set of items and response data (McNeish 2023). This has been implemented in the development version of dynamic.

Code
library(dynamic) # devtools::install_github("melissagwolf/dynamic") for development version
Beta version. Please report bugs: https://github.com/melissagwolf/dynamic/issues.
Code
dyncut <- catOne(cfamodel2, reps = 500)
Code
dyncut
Your DFI cutoffs: 
            SRMR  RMSEA CFI  
Level-0     0.015 0.011 0.999
Specificity 95%   95%   95%  
                             
Level-1     0.026 0.04  0.991
Sensitivity 95%   95%   95%  
                             
Level-2     0.031 0.056 0.983
Sensitivity 95%   95%   95%  
                             
Level-3     0.034 0.066 0.977
Sensitivity 95%   95%   95%  

Empirical fit indices: 
 Chi-Square  df p-value   SRMR   RMSEA    CFI
   2940.978 135       0   0.06   0.115   0.93

 Notes:
  -'Sensitivity' is % of hypothetically misspecified models correctly identified by cutoff in DFI simulation
  -Cutoffs with 95% sensitivity are reported when possible
  -If sensitivity is <50%, cutoffs will be supressed 

Explanations on Levels 0-3 from the dynamic package vignette:

When there are 6 or more items, cfaOne will consider three levels of misspecification. As in catHB, the Level-0 row corresponds to the anticipated fit index values if the fitted model were the exact underlying population model. The Level-1 row corresponds to the anticipated fit index values if the fitted model omitted 0.30 residual correlations between approximately 1/3 of item pairs. The Level-2 row corresponds to the anticipated fit index values if the fitted model omitted 0.30 residual correlations between approximately 2/3 of item pairs. The Level-3 row corresponds to the anticipated fit index values if the fitted model omitted 0.30 residual correlations between all item pairs.

As we can see, the observed/empirical fit metrics do not come close even to the most lenient, Level-3, simulation-based cutoff values.
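To make the comparison explicit, one can set the empirical indices against the DFI cutoffs programmatically. This is a sketch assuming `cfamodel2` from above; the Level-3 cutoff values are copied from the dynamic output:

```r
library(lavaan)

emp <- fitmeasures(cfamodel2, c("srmr", "rmsea.scaled", "cfi.scaled"))

# Level-3 DFI cutoffs, copied from the dynamic output above
cutoffs <- c(srmr = 0.034, rmsea = 0.066, cfi = 0.977)

# SRMR and RMSEA should fall below their cutoffs, CFI above its cutoff
data.frame(index = c("SRMR", "RMSEA", "CFI"),
           empirical = round(unname(emp), 3),
           cutoff = unname(cutoffs),
           acceptable = c(emp[[1]] < cutoffs[[1]],
                          emp[[2]] < cutoffs[[2]],
                          emp[[3]] > cutoffs[[3]]))
```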

2 Summary comments

The 18 items do not fit a unidimensional model, due to issues with residual correlations and potential multidimensionality.

3 Exploratory factor analysis

Let’s look at the data using EFA. A lot of the code for this analysis was borrowed from https://solomonkurz.netlify.app/blog/2021-05-11-yes-you-can-fit-an-exploratory-factor-analysis-with-lavaan/

Code
f1 <- 'efa("efa")*f1 =~ swb1 + swb2 + swb3 + swb4 + swb5 + swb6 + swb7 + swb8 +
                    swb9 + swb10 + swb11 + swb12 + swb13 + swb14 + swb15 + 
                    swb16 + swb17 + swb18'

# 2-factor model
f2 <- 'efa("efa")*f1 + 
       efa("efa")*f2 =~ swb1 + swb2 + swb3 + swb4 + swb5 + swb6 + swb7 + swb8 +
                    swb9 + swb10 + swb11 + swb12 + swb13 + swb14 + swb15 + 
                    swb16 + swb17 + swb18'

# 3-factor
f3 <- '
efa("efa")*f1 +
efa("efa")*f2 +
efa("efa")*f3 =~ swb1 + swb2 + swb3 + swb4 + swb5 + swb6 + swb7 + swb8 +
                    swb9 + swb10 + swb11 + swb12 + swb13 + swb14 + swb15 + 
                    swb16 + swb17 + swb18'

# 4-factor
f4 <- '
efa("efa")*f1 +
efa("efa")*f2 +
efa("efa")*f3 +
efa("efa")*f4 =~ swb1 + swb2 + swb3 + swb4 + swb5 + swb6 + swb7 + swb8 +
                    swb9 + swb10 + swb11 + swb12 + swb13 + swb14 + swb15 + 
                    swb16 + swb17 + swb18'

# 5-factor
f5 <- '
efa("efa")*f1 +
efa("efa")*f2 +
efa("efa")*f3 +
efa("efa")*f4 + 
efa("efa")*f5 =~ swb1 + swb2 + swb3 + swb4 + swb5 + swb6 + swb7 + swb8 +
                    swb9 + swb10 + swb11 + swb12 + swb13 + swb14 + swb15 + 
                    swb16 + swb17 + swb18'

efa_f1 <- 
  cfa(model = f1,
      data = dd,
      rotation = "oblimin",
      estimator = "WLSMV",
      ordered = TRUE)
efa_f2 <- 
  cfa(model = f2,
      data = dd,
      rotation = "oblimin",
      estimator = "WLSMV",
      ordered = TRUE)
efa_f3 <- 
  cfa(model = f3,
      data = dd,
      rotation = "oblimin",
      estimator = "WLSMV",
      ordered = TRUE)
efa_f4 <- 
  cfa(model = f4,
      data = dd,
      rotation = "oblimin",
      estimator = "WLSMV",
      ordered = TRUE)
efa_f5 <- 
  cfa(model = f5,
      data = dd,
      rotation = "oblimin",
      estimator = "WLSMV",
      ordered = TRUE)

3.1 Model fit table

Code
rbind(
  fitmeasures(efa_f1, fit_metrics_scaled),
  fitmeasures(efa_f2, fit_metrics_scaled),
  fitmeasures(efa_f3, fit_metrics_scaled),
  fitmeasures(efa_f4, fit_metrics_scaled),
  fitmeasures(efa_f5, fit_metrics_scaled)
  ) %>% 
  as.data.frame() %>% 
  mutate(across(where(is.numeric),~ round(.x, 3))) %>%
  rename(Chi2.scaled = chisq.scaled,
         p.scaled = pvalue.scaled,
         CFI.scaled = cfi.scaled,
         TLI.scaled = tli.scaled,
         RMSEA.scaled = rmsea.scaled,
         CI_low.scaled = rmsea.ci.lower.scaled,
         CI_high.scaled = rmsea.ci.upper.scaled,
         SRMR = srmr) %>% 
  add_column(Model = paste0(1:5,"-factor"), .before = "Chi2.scaled") %>% 
  knitr::kable()
Model Chi2.scaled df p.scaled CFI.scaled TLI.scaled RMSEA.scaled CI_low.scaled CI_high.scaled SRMR
1-factor 2940.978 135 0 0.930 0.921 0.115 0.112 0.119 0.060
2-factor 1984.981 118 0 0.954 0.940 0.101 0.097 0.105 0.047
3-factor 1255.923 102 0 0.971 0.957 0.085 0.081 0.089 0.033
4-factor 858.626 87 0 0.981 0.966 0.075 0.071 0.080 0.026
5-factor 441.334 73 0 0.991 0.981 0.057 0.052 0.062 0.018
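Model fit keeps improving as factors are added, as it typically does; a complementary way to gauge the number of factors (a sketch, assuming the `psych` package and the `dd` data from above) is parallel analysis on the polychoric correlations:

```r
library(psych)

swb <- dd[, paste0("swb", 1:18)]

# Compares observed eigenvalues to those from resampled/simulated data;
# cor = "poly" uses polychoric correlations, matching the ordinal items.
fa.parallel(swb, cor = "poly", fa = "fa", fm = "minres")
```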

3.2 Plot 4-factor EFA

Code
standardizedsolution(efa_f4) %>% 
  filter(op == "=~") %>% 
  mutate(item  = str_remove(rhs, "swb") %>% as.double(),
         factor = str_remove(lhs, "f")) %>% 
  # plot
  ggplot(aes(x = est.std, xmin = ci.lower, xmax = ci.upper, y = item)) +
  annotate(geom = "rect",
           xmin = -1, xmax = 1,
           ymin = -Inf, ymax = Inf,
           fill = "grey90") +
  annotate(geom = "rect",
           xmin = -0.7, xmax = 0.7,
           ymin = -Inf, ymax = Inf,
           fill = "grey93") +
  annotate(geom = "rect",
           xmin = -0.4, xmax = 0.4,
           ymin = -Inf, ymax = Inf,
           fill = "grey96") +
  geom_vline(xintercept = 0, color = "white") +
  geom_pointrange(aes(alpha = abs(est.std) < 0.4),
                  fatten = 10) +
  geom_text(aes(label = item, color = abs(est.std) < 0.4),
            size = 4) +
  scale_color_manual(values = c("white", "transparent")) +
  scale_alpha_manual(values = c(1, 1/3)) +
  scale_x_continuous(expression(lambda[standardized]), 
                     expand = c(0, 0), limits = c(-1, 1),
                     breaks = c(-1, -0.7, -0.4, 0, 0.4, 0.7, 1),
                     labels = c("-1", "-.7", "-.4", "0", ".4", ".7", "1")) +
  scale_y_continuous(breaks = 1:18, sec.axis = sec_axis(~ . * 1, breaks = 1:18)) +
  ggtitle("Factor loadings for the 4-factor model") +
  theme(legend.position = "none") +
  facet_wrap(~ factor, labeller = label_both) 

3.3 EFA comments

As we saw in the CFA modification indices, most issues seem to stem from residual correlations: some items are too similar, and one item in each correlated pair would need to be removed.

Looking at the 4-factor solution (counting standardized loadings above 0.4), we have one factor with a single item, two factors with 4 items each, and one with 7 items.

Let’s review the 7 items with standardized loadings > 0.4 on the factor with the most items in the 4-factor solution.

Code
items <- standardizedsolution(efa_f4) %>% 
  filter(op == "=~",
         lhs == "f3",
         est.std > 0.4) %>% 
  pull(rhs)

itemlabels <- read_csv("data/itemlabels_swb.csv") %>% 
  mutate(itemnr = paste0("swb",1:18))

standardizedsolution(efa_f4) %>% 
  filter(op == "=~",
         lhs == "f3",
         est.std > 0.4) %>% 
  arrange(desc(est.std)) %>% 
  mutate_if(is.numeric, ~ round(.x, 3)) %>% 
  dplyr::select(!c(lhs,op,z,pvalue)) %>% 
  dplyr::rename(itemnr = rhs,
                loading = est.std) %>% 
  left_join(itemlabels, by = "itemnr") %>% 
  knitr::kable()
itemnr loading se ci.lower ci.upper item
swb11 0.868 0.024 0.821 0.914 … have been able to be focused and concentrated on today’s tasks?
swb8 0.838 0.026 0.786 0.889 … felt satisfied with your life in its present situation?
swb17 0.793 0.029 0.736 0.849 … had the power to recover one’s strength if something has been stressful or difficult?
swb7 0.779 0.026 0.727 0.830 … been able to object and to assert yourself when this is needed?
swb15 0.652 0.032 0.589 0.715 … been able to make decisions and carry them out?
swb10 0.552 0.030 0.493 0.611 … slept well and got the right amount of sleep?
swb9 0.419 0.039 0.343 0.496 … felt that your life is meaningful?

4 Brief Rasch analysis

For fun, let’s see how the 7 items from the EFA above work as a unidimensional scale using Rasch Measurement Theory.

Code
library(RISEkbmRasch) # install first with `devtools::install_github("pgmj/RISEkbmRasch")`
df <- dd %>% 
  dplyr::select(all_of(items)) %>% 
  na.omit()

We will primarily use the eRm package, which relies on Conditional Maximum Likelihood (CML) estimation. Since the items are polytomous, the Partial Credit Model (PCM) will be used.
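As a minimal sketch of what the wrapper functions used below build on (assuming `df` holds the seven selected items scored 0-4), a PCM fit in eRm looks like this:

```r
library(eRm)

pcm <- PCM(df)                  # Partial Credit Model, CML estimation
summary(pcm)                    # item threshold parameters
ppar <- person.parameter(pcm)   # person ability estimates
itemfit(ppar)                   # infit/outfit MSQ per item
```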

itemnr item
swb7 … been able to object and to assert yourself when this is needed?
swb8 … felt satisfied with your life in its present situation?
swb9 … felt that your life is meaningful?
swb10 … slept well and got the right amount of sleep?
swb11 … have been able to be focused and concentrated on today’s tasks?
swb15 … been able to make decisions and carry them out?
swb17 … had the power to recover one’s strength if something has been stressful or difficult?
Code
simfit1 <- RIgetfit(df, iterations = 500, cpu = 8) 
RIitemfit(df, simfit1)
Item InfitMSQ Infit thresholds OutfitMSQ Outfit thresholds Infit diff Outfit diff
swb7 0.927 [0.921, 1.082] 0.933 [0.92, 1.084] no misfit no misfit
swb8 0.758 [0.923, 1.073] 0.749 [0.908, 1.106] 0.165 0.159
swb9 1.1 [0.926, 1.076] 1.094 [0.919, 1.084] 0.024 0.01
swb10 1.09 [0.931, 1.084] 1.087 [0.912, 1.096] 0.006 no misfit
swb11 0.997 [0.922, 1.074] 0.966 [0.918, 1.076] no misfit no misfit
swb15 1.083 [0.931, 1.072] 1.115 [0.928, 1.081] 0.011 0.034
swb17 1.042 [0.917, 1.08] 1.069 [0.919, 1.083] no misfit no misfit
Note:
MSQ values based on conditional calculations (n = 1561 complete cases).
Simulation based thresholds from 500 simulated datasets.
Code
RIgetfitPlot(simfit1, df)

Code
ICCplot(as.data.frame(df), 
        itemnumber = 2, 
        method = "cut", 
        itemdescrip = c("item 8"))

Code
### also suggested:
library(RASCHplot) # install first with `devtools::install_github("ERRTG/RASCHplot")`
CICCplot(PCM(df),
         which.item = 2,
         lower.groups = c(0,6,12,17,23),
         grid.items = FALSE)
$swb8

Code
      observed expected se     pvalue padj.BH sig 
swb7  0.7645   0.7290   0.0134 0.0080 0.0280  *   
swb8  0.8209   0.7305   0.0115 0.0000 0.0000  *** 
swb9  0.6951   0.7285   0.0153 0.0286 0.0667  .   
swb10 0.7091   0.7321   0.0144 0.1114 0.1949      
swb11 0.7386   0.7328   0.0134 0.6613 0.6613      
swb15 0.7168   0.7337   0.0135 0.2119 0.2967      
swb17 0.7202   0.7293   0.0146 0.5317 0.6203      
Code

PCA of Rasch model residuals

Eigenvalues Proportion of variance
1.63 24%
1.45 20.7%
1.21 17.2%
0.99 14.8%
0.88 13.1%
Code
simcor1 <- RIgetResidCor(df, iterations = 500, cpu = 8)
RIresidcorr(df, cutoff = simcor1$p99)
swb7 swb8 swb9 swb10 swb11 swb15 swb17
swb7
swb8 0.16
swb9 -0.12 0.02
swb10 -0.23 -0.21 -0.02
swb11 -0.14 -0.17 -0.23 -0.18
swb15 -0.25 -0.21 -0.23 -0.11 -0.24
swb17 -0.16 -0.2 -0.34 -0.26 0.01 0
Note:
Relative cut-off value (highlighted in red) is -0.06, which is 0.088 above the average correlation (-0.148).
Code

Code
mirt(df, model=1, itemtype='Rasch', verbose = FALSE) %>% 
  plot(type="trace", as.table = TRUE, 
       theta_lim = c(-8,8))

Code
# increase fig-height above as needed, if you have many items
RItargeting(df)

Code

Code
iarm::score_groups(as.data.frame(df)) %>% 
  as.data.frame(nm = "score_group") %>% 
  dplyr::count(score_group)
  score_group   n
1           1 786
2           2 775
Code
dif_plots <- df %>% 
  add_column(dif = iarm::score_groups(.)) %>% 
  mutate(dif = factor(dif, labels = c("Below median score","Above median score"))) %>% 
  split(.$dif) %>% # split the data using the DIF variable
  map(~ RItileplot(.x %>% dplyr::select(!dif)) + labs(title = .x$dif))
dif_plots[[1]] + dif_plots[[2]]

4.1 Rasch analysis comments

Item 8 shows misfit (its infit MSQ of 0.758 falls below the simulation-based threshold) and has a residual correlation with item 7 above the cut-off. We'll remove it and run the analysis again.

We have no demographic information, which makes it difficult to evaluate invariance/DIF. I tried splitting the data into score groups based on the median score, but the high-scoring group had too much missing data in the lower response categories for the analysis to be feasible.

Code
df$swb8 <- NULL

5 Rasch analysis 2

itemnr item
swb7 … been able to object and to assert yourself when this is needed?
swb9 … felt that your life is meaningful?
swb10 … slept well and got the right amount of sleep?
swb11 … have been able to be focused and concentrated on today’s tasks?
swb15 … been able to make decisions and carry them out?
swb17 … had the power to recover one’s strength if something has been stressful or difficult?
Code
simfit2 <- RIgetfit(df, iterations = 500, cpu = 8) 
RIitemfit(df, simfit2)
Item InfitMSQ Infit thresholds OutfitMSQ Outfit thresholds Infit diff Outfit diff
swb7 0.955 [0.916, 1.073] 0.959 [0.918, 1.072] no misfit no misfit
swb9 1.092 [0.919, 1.072] 1.091 [0.912, 1.102] 0.02 no misfit
swb10 1.018 [0.922, 1.092] 1.015 [0.918, 1.094] no misfit no misfit
swb11 0.941 [0.935, 1.086] 0.906 [0.913, 1.093] no misfit 0.007
swb15 1.011 [0.931, 1.08] 1.032 [0.928, 1.086] no misfit no misfit
swb17 0.974 [0.927, 1.073] 0.99 [0.924, 1.075] no misfit no misfit
Note:
MSQ values based on conditional calculations (n = 1561 complete cases).
Simulation based thresholds from 500 simulated datasets.
Code
RIgetfitPlot(simfit2, df)

Code
ICCplot(as.data.frame(df), 
        itemnumber = c(2,4), 
        method = "cut", 
        itemdescrip = c("item 9","item 11"))

Code
### also suggested:
library(RASCHplot) # install first with `devtools::install_github("ERRTG/RASCHplot")`
CICCplot(PCM(df),
         which.item = c(2,4),
         lower.groups = c(0,6,12,18),
         grid.items = TRUE)

Code
      observed expected se     pvalue padj.BH sig
swb7  0.7380   0.7090   0.0144 0.0442 0.1326     
swb9  0.6776   0.7096   0.0159 0.0442 0.1326     
swb10 0.7128   0.7135   0.0144 0.9641 0.9641     
swb11 0.7388   0.7146   0.0134 0.0716 0.1432     
swb15 0.7188   0.7147   0.0139 0.7716 0.9260     
swb17 0.7238   0.7103   0.0146 0.3544 0.5316     
Code

PCA of Rasch model residuals

Eigenvalues Proportion of variance
1.56 26.8%
1.36 22.3%
1.17 18.9%
0.98 17%
0.92 14.9%
Code
simcor2 <- RIgetResidCor(df, iterations = 500, cpu = 8)
RIresidcorr(df, cutoff = simcor2$p99)
swb7 swb9 swb10 swb11 swb15 swb17
swb7
swb9 -0.08
swb10 -0.22 -0.03
swb11 -0.12 -0.23 -0.22
swb15 -0.23 -0.24 -0.15 -0.28
swb17 -0.15 -0.35 -0.3 -0.02 -0.04
Note:
Relative cut-off value (highlighted in red) is -0.083, which is 0.095 above the average correlation (-0.177).
Code

Code
# increase fig-height above as needed, if you have many items
RItargeting(df)

Code

Code
RItif(df, cutoff = 2.5, samplePSI = TRUE)

Code
Separation Reliability: 0.8774
Code
RIpfit(df)

Code
Threshold 1 Threshold 2 Threshold 3 Threshold 4 Item location
swb7 -4.73 -2.00 0.58 3.39 -0.69
swb9 -3.45 -0.31 2.38 4.73 0.84
swb10 -2.80 -0.37 1.77 4.34 0.74
swb11 -3.69 -1.33 0.44 2.32 -0.56
swb15 -2.81 -0.86 1.34 3.59 0.32
swb17 -3.82 -1.81 0.30 2.80 -0.63
Note:
Item location is the average of the thresholds for each item.
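To illustrate the note above: averaging the four thresholds of swb7 reproduces its reported item location.

```r
# Item location for swb7 = mean of its four thresholds
round(mean(c(-4.73, -2.00, 0.58, 3.39)), 2)
#> [1] -0.69
```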
Code
RIscoreSE(df, output = "figure")

Code
Ordinal sum score Logit score Logit std.error
0 -6.354 0.721
1 -5.025 0.884
2 -4.285 0.827
3 -3.717 0.753
4 -3.228 0.701
5 -2.781 0.670
6 -2.357 0.650
7 -1.948 0.638
8 -1.550 0.629
9 -1.159 0.623
10 -0.775 0.619
11 -0.397 0.617
12 -0.021 0.616
13 0.355 0.618
14 0.736 0.622
15 1.124 0.627
16 1.520 0.635
17 1.926 0.645
18 2.343 0.658
19 2.778 0.679
20 3.241 0.711
21 3.750 0.763
22 4.339 0.837
23 5.095 0.892
24 6.425 0.723
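A hedged sketch of how the table above can serve as an ordinal-to-interval conversion key (the data frame and variable names here are illustrative, and only three of the rows shown above are included):

```r
# Illustrative lookup table built from a subset of the rows above
score_table <- data.frame(ordinal = c(0, 12, 24),
                          logit   = c(-6.354, -0.021, 6.425))

# Convert raw ordinal sum scores to interval-level logit scores
raw_scores <- c(0, 12, 24)
score_table$logit[match(raw_scores, score_table$ordinal)]
#> [1] -6.354 -0.021  6.425
```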

5.1 Rasch analysis comments

This is a set of items with decent psychometric properties. Reliability could be higher, and targeting shows a minor ceiling effect, which all well-being questionnaires seem to have to some degree.

Code
itemnr item
swb7 … been able to object and to assert yourself when this is needed?
swb9 … felt that your life is meaningful?
swb10 … slept well and got the right amount of sleep?
swb11 … have been able to be focused and concentrated on today’s tasks?
swb15 … been able to make decisions and carry them out?
swb17 … had the power to recover one’s strength if something has been stressful or difficult?

6 Software used

Code
R version 4.4.1 (2024-06-14)
Platform: aarch64-apple-darwin20
Running under: macOS Sonoma 14.6.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Stockholm
tzcode source: internal

attached base packages:
 [1] parallel  grid      stats4    stats     graphics  grDevices utils    
 [8] datasets  methods   base     

other attached packages:
 [1] RASCHplot_0.1.0      ggdist_3.3.2         doParallel_1.0.17   
 [4] iterators_1.0.14     foreach_1.5.2        RISEkbmRasch_0.2.4.2
 [7] janitor_2.2.0        iarm_0.4.3           hexbin_1.28.4       
[10] catR_3.17            glue_1.7.0           ggrepel_0.9.6       
[13] reshape_0.8.9        matrixStats_1.3.0    psychotree_0.16-1   
[16] psychotools_0.7-4    partykit_1.2-22      mvtnorm_1.3-1       
[19] libcoin_1.0-10       psych_2.4.6.26       mirt_1.42           
[22] lattice_0.22-6       eRm_1.0-6            kableExtra_1.4.0    
[25] formattable_0.2.1    knitr_1.48           dynamic_1.1.0       
[28] patchwork_1.2.0      lavaan_0.6-18        lubridate_1.9.3     
[31] forcats_1.0.0        stringr_1.5.1        dplyr_1.1.4         
[34] purrr_1.0.2          readr_2.1.5          tidyr_1.3.1         
[37] tibble_3.2.1         ggplot2_3.5.1        tidyverse_2.0.0     
[40] readxl_1.4.3        

loaded via a namespace (and not attached):
  [1] later_1.3.2          splines_4.4.1        R.oo_1.26.0         
  [4] cellranger_1.1.0     rpart_4.1.23         lifecycle_1.0.4     
  [7] Rdpack_2.6.1         rstatix_0.7.2        rprojroot_2.0.4     
 [10] globals_0.16.3       MASS_7.3-61          backports_1.5.0     
 [13] magrittr_2.0.3       vcd_1.4-12           Hmisc_5.1-3         
 [16] rmarkdown_2.28       yaml_2.3.10          httpuv_1.6.15       
 [19] sessioninfo_1.2.2    cowplot_1.1.3        RColorBrewer_1.1-3  
 [22] pbapply_1.7-2        abind_1.4-5          audio_0.1-11        
 [25] quadprog_1.5-8       R.utils_2.12.3       nnet_7.3-19         
 [28] listenv_0.9.1        GenOrd_1.4.0         testthat_3.2.1.1    
 [31] RPushbullet_0.3.4    vegan_2.6-8          parallelly_1.38.0   
 [34] svglite_2.1.3        permute_0.9-7        codetools_0.2-20    
 [37] DT_0.33              xml2_1.3.6           tidyselect_1.2.1    
 [40] farver_2.1.2         base64enc_0.1-3      jsonlite_1.8.8      
 [43] progressr_0.14.0     Formula_1.2-5        survival_3.7-0      
 [46] systemfonts_1.1.0    tools_4.4.1          gnm_1.1-5           
 [49] snow_0.4-4           Rcpp_1.0.13          mnormt_2.1.1        
 [52] gridExtra_2.3        xfun_0.46            here_1.0.1          
 [55] mgcv_1.9-1           distributional_0.4.0 Bayesrel_0.7.7      
 [58] ca_0.71.1            withr_3.0.1          beepr_2.0           
 [61] fastmap_1.2.0        fansi_1.0.6          digest_0.6.37       
 [64] mime_0.12            timechange_0.3.0     R6_2.5.1            
 [67] colorspace_2.1-1     simstandard_0.6.3    R.methodsS3_1.8.2   
 [70] inum_1.0-5           utf8_1.2.4           generics_0.1.3      
 [73] data.table_1.15.4    SimDesign_2.17.1     htmlwidgets_1.6.4   
 [76] semTools_0.5-6       pkgconfig_2.0.3      gtable_0.3.5        
 [79] lmtest_0.9-40        brio_1.1.5           htmltools_0.5.8.1   
 [82] carData_3.0-5        scales_1.3.0         corrplot_0.92       
 [85] snakecase_0.11.1     rstudioapi_0.16.0    reshape2_1.4.4      
 [88] tzdb_0.4.0           checkmate_2.3.2      nlme_3.1-165        
 [91] curl_5.2.2           cachem_1.1.0         zoo_1.8-12          
 [94] relimp_1.0-5         vcdExtra_0.8-5       foreign_0.8-87      
 [97] pillar_1.9.0         vctrs_0.6.5          promises_1.3.0      
[100] ggpubr_0.6.0         car_3.1-2            xtable_1.8-4        
[103] Deriv_4.1.3          cluster_2.1.6        dcurver_0.9.2       
[106] GPArotation_2024.3-1 htmlTable_2.4.3      evaluate_0.24.0     
[109] pbivnorm_0.6.0       cli_3.6.3            compiler_4.4.1      
[112] rlang_1.1.4          future.apply_1.11.2  ggsignif_0.6.4      
[115] labeling_0.4.3       plyr_1.8.9           stringi_1.8.4       
[118] viridisLite_0.4.2    munsell_0.5.1        Matrix_1.7-0        
[121] qvcalc_1.0.3         hms_1.1.3            future_1.34.0       
[124] shiny_1.9.1          rbibutils_2.2.16     memoise_2.0.1       
[127] broom_1.0.6          ggstance_0.3.7      

7 References

Hlynsson, Jón Ingi, Anders Sjöberg, Lars Ström, and Per Carlbring. 2024. “Evaluating the Reliability and Validity of the Questionnaire on Well-Being: A Validation Study for a Clinically Informed Measurement of Subjective Well-Being.” Cognitive Behaviour Therapy 0 (0): 1–23. https://doi.org/10.1080/16506073.2024.2402992.
Hu, Li‐tze, and Peter M. Bentler. 1999. “Cutoff Criteria for Fit Indexes in Covariance Structure Analysis: Conventional Criteria Versus New Alternatives.” Structural Equation Modeling: A Multidisciplinary Journal 6 (1): 1–55. https://doi.org/10.1080/10705519909540118.
McNeish, Daniel. 2023. “Dynamic Fit Index Cutoffs for Categorical Factor Analysis with Likert-Type, Ordinal, or Binary Responses.” American Psychologist 78 (9): 1061–75. https://doi.org/10.1037/amp0001213.
Savalei, Victoria. 2018. “On the Computation of the RMSEA and CFI from the Mean-And-Variance Corrected Test Statistic with Nonnormal Data in SEM.” Multivariate Behavioral Research 53 (3): 419–29. https://doi.org/10.1080/00273171.2018.1455142.

Reuse